chloechia4
Contributor

@chloechia4 chloechia4 commented Oct 1, 2025

Description

This PR adds tests for the cuFile low-level bindings and for the generated low-level bindings introduced in CUDA 13.0.

CUDA 13.0 CuFile Operations

test_set_stats_level
test_stats_start
test_stats_stop
test_stats_reset
test_get_stats_l1
test_get_stats_l2
test_get_stats_l3
test_get_bar_size_in_kb
test_set_parameter_posix_pool_slab_array
test_set_get_parameter_size_t

Note: The original test_batch_io_large_operations() stopped passing after the switch from CUDA 12.9 to 13.0. The cause was that the test submitted all operations (reads and writes) together in one batch, so the file reads could occur before the writes. The reads then returned 0 bytes because no write I/O had happened yet, and the test failed trivially. I split the test into two phases: the writes complete first in one batch handle, and then the reads are submitted in a second batch handle. The new test also passes with CUDA 12.9.
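
The two-phase ordering described in the note can be illustrated without cuFile at all. The sketch below is not the test from this PR: it uses a thread pool with `os.pwrite`/`os.pread` on a temporary file as a stand-in for a batch handle, and the helper name `run_batch`, the chunk size, and the operation count are made up for illustration. It only shows why the write batch has to be drained before the read batch is submitted.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096   # illustrative chunk size, not taken from the PR
NUM_OPS = 8    # illustrative number of operations per batch


def run_batch(fn, args_list):
    """Stand-in for a batch handle: submit every op, then wait for all of them."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(fn, *args) for args in args_list]
        return [f.result() for f in futures]


def two_phase_roundtrip():
    # Each chunk gets a distinct byte value so misplaced reads are detectable.
    payloads = [bytes([i]) * CHUNK for i in range(NUM_OPS)]
    fd, path = tempfile.mkstemp()
    try:
        # Phase 1: all writes go into one batch, which is fully drained first.
        written = run_batch(
            os.pwrite, [(fd, payloads[i], i * CHUNK) for i in range(NUM_OPS)]
        )
        # Phase 2: reads are submitted only after every write has completed.
        read_back = run_batch(
            os.pread, [(fd, CHUNK, i * CHUNK) for i in range(NUM_OPS)]
        )
    finally:
        os.close(fd)
        os.unlink(path)
    return written, read_back, payloads
```

If the reads and writes were submitted through a single `run_batch` call instead, nothing would order them relative to each other, which is the failure mode the note describes.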

All tests passing across CUDA versions

@copy-pr-bot
Contributor

copy-pr-bot bot commented Oct 1, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang
Member

leofang commented Oct 1, 2025

Thanks, Chloe! Pinning you internally...

@leofang leofang requested review from cpcloud and mdboom October 1, 2025 23:03
@leofang leofang added the P0 (High priority - Must do!), feature (New feature or request), and cuda.bindings (Everything related to the cuda.bindings module) labels Oct 1, 2025
@leofang leofang added this to the cuda-python parking lot milestone Oct 1, 2025
@leofang
Member

leofang commented Oct 17, 2025

/ok to test e1eabf8

Comment on lines 1954 to 1955
# Reset to level 0 (disabled) for cleanup
cufile.set_stats_level(0)
Member

This is still unresolved.

@pytest.mark.skipif(
    cufileVersionLessThan(1150), reason="cuFile parameter APIs require cuFile library version 13.0 or later"
)
def test_stats_start():
Member

@chloechia4 Please remove this test entirely. I don't see any change?

Comment on lines 2025 to 2026
# Set statistics level first (required before starting stats)
cufile.set_stats_level(1) # Level 1 = basic statistics
Member

Where...? @chloechia4 if you made the changes locally, make sure you push them to the remote.

Comment on lines +2062 to +2063
# Set statistics level first (required before starting stats)
cufile.set_stats_level(1) # Level 1 = basic statistics
Member

this has not happened yet

Comment on lines +2106 to +2107
# Reset cuFile statistics to clear all counters
cufile.stats_reset()
Member

ditto

fd = os.open(file_path, os.O_CREAT | os.O_RDWR | os.O_DIRECT, 0o600)

try:
    cufile.set_stats_level(2)  # L2 = detailed performance metrics
Member

this needs to be restored

Contributor Author

Do you mean reset()? I'm adding a stats_reset() call in the "finally" section.
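
One way to read this exchange: a test that changes the statistics level should undo that change even when the test body raises. The sketch below shows the try/finally shape with a hypothetical in-memory stand-in (`FakeCufile`) rather than the real bindings. Only the names `set_stats_level` and `stats_reset` come from the quoted test code; the helper `run_with_stats_level` is made up, and the real behavior of the cuFile calls is not modeled here.

```python
class FakeCufile:
    """Hypothetical stand-in for the cufile module; it only records calls."""

    def __init__(self):
        self.level = 0
        self.calls = []

    def set_stats_level(self, level):
        self.level = level
        self.calls.append(("set_stats_level", level))

    def stats_reset(self):
        self.calls.append(("stats_reset",))


def run_with_stats_level(cufile, level, body):
    """Run body() at the given stats level, then always clean up."""
    cufile.set_stats_level(level)
    try:
        return body()
    finally:
        # Cleanup runs on success and on failure alike.
        cufile.stats_reset()
        cufile.set_stats_level(0)  # level 0 = disabled, per the quoted comment
```

Whether the cleanup should reset the counters, restore the level, or do both is the open question in the thread; the sketch does both only to show where such calls would go.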

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test

@copy-pr-bot
Contributor

copy-pr-bot bot commented Oct 20, 2025

/ok to test

@mdboom, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test 65573fc

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test 642be44

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test a9584c8

@copy-pr-bot
Contributor

copy-pr-bot bot commented Oct 20, 2025

/ok to test a9584c8

@mdboom, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test a9584c8

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test 3f81345

@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test ed6eeb2

@mdboom mdboom enabled auto-merge (squash) October 20, 2025 18:26
@mdboom
Contributor

mdboom commented Oct 20, 2025

/ok to test 125f968

@mdboom mdboom merged commit 002a069 into NVIDIA:main Oct 20, 2025
71 checks passed
@github-actions

Doc Preview CI
Preview removed because the pull request was closed or merged.

